home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
HPAVC
/
HPAVC CD-ROM.iso
/
ARASAN_S.ZIP
/
README
< prev
next >
Wrap
Text File
|
1994-08-15
|
29KB
|
583 lines
Arasan - Chess for Windows - version 1.2
Programmer's Guide
by Jon Dart
Arasan is a chess program for Windows 3.1. It is copyrighted
but under terms that allow free distribution for non-commercial
purposes.
This archive contains the source code for Arasan. I am providing
the source code for the use of programmers who may wish to
understand how the program works or to modify it for their own use.
However, you may not distribute a modified version of this program
without the consent of the original author.
Arasan is written in C++ and was compiled with the Borland C++
compiler, version 4.0 (the source can also be compiled under
Borland C++ version 3.1). Arasan uses the Windows++ class library
by Paul DiLascia. Source for Windows++ is not included in this
distribution. It is available for $25 from:
Paul DiLascia
P.O. Box 391632
Cambridge, MA 02139
Note: I have no connection with Paul DiLascia. I am just a user of
his library. In a future version of Arasan I hope to convert to using
a different, more widely available class library such as MFC or OWL.
I have made a few bug fixes in the Windows++ source code; see the
file WPP.DIF for details of what was changed.
The remainder of this file contains information for use by programmers
reading or working on Arasan source code. I assume that you have a
working familiarity with C++, Windows programming, and the Borland
compiler and tools.
Building Arasan
I have attempted to make the source code for Arasan as portable as
possible, but I don't guarantee that it can be built with
any compiler except Borland C++. In particular, since it uses
templates, it cannot be compiled with Microsoft Visual C++ 1.0
or 1.5.
To compile Arasan, first edit the "make.cfg" file so that BCCDIR
points to your installation of Borland C++, and WPPDIR points to your
installation of Windows++. If you are using Borland C++ 3.1, you
will need to remove the -x- switch from the CFLAGS variable defined
in this file.
Then just type "make" from the source directory. Make will compile
and link the executable (arasan.exe). I recommend that you have a
minimum of 4 Megabytes of memory in order to build the source. If
you are compiling in a DOS box and run out of memory, try exiting
Windows and running "make" from DOS.
Opening book format
The opening book for Arasan is composed of two binary files, one for the
White moves ("BOOKW") and one for the black moves ("BOOKB"). At build
time, these files are bound into the executable by the resource compiler.
The format of the binary files is documented in makebook.cpp.
Since BOOKB and BOOKW are supplied with the source, you don't need to
create these files in order to build Arasan. However, if you want to
modify the contents of the opening book, you will need to edit the ASCII
source for the openings and rebuild BOOKB and BOOKW. Following is
information on how to do this.
The ASCII source for the opening book is contained in the file BOOK.TXT.
Moves are listed one or more per line in standard algebraic notation.
A move may optionally be followed by a number from 0 to 9. The
number, if given, is the "weight" assigned to the move. A weight of
0 means that the computer will never play the move, but will respond
to it if it is played by its opponent. However, if a weight of 0 is
specified after a non-zero weight for the same move in the same
position has already been specified, the original weight is
unchanged.
The higher the weight, the more often the computer will choose the
move compared to the other alternatives in the book. By default, if
no weight is specified, a weight of 5 is assigned. The supplied book
uses weights rather sparingly. In general, the computer has been
steered away from gambits, and away from openings it doesn't play
well. It is also biased towards playing the main line of an opening
as opposed to less common variants. Note: I am not a strong player,
and so the opening book is probably far from optimal. It is also not
very extensive: currently it contains aroud 10000 half-moves, mostly
from recent grandmaster games. Suggestions for expansion and/or
improvement of the opening book are quite welcome.
Some special symbols are allowed in the opening book. Any line
beginning with ";" is treated as a comment and ignored. An "m" in
the first column means that the present position should be
remembered. An "r" in the first column restores the position to
where it was when the last "m" was encountered. This allows easy
entry of variant lines from the same starting position. Only one
remembered position can be stored and retrieved, however: e.g. if you
specify "m" twice, the first position stored is discarded.
Building the opening book
To build the binary opening files from BOOK.TXT, you first need to
build the program MAKEBOOK.EXE. To do this, type "make -f
makebook.mak". MAKEBOOK is a DOS program that uses some of the same
files as the chess program. Since it must be built for DOS and not
Windows, make sure you remove any old object files before building
it, so that you don't get Windows objects mixed in with the DOS
objects.
Once MAKEBOOK has been built, just execute it and it will read BOOK.TXT
and produce BOOKB and BOOKW. Error messages are issued if any moves
are illegal or if there are syntax errors in the input file.
Testing support
Arasan includes several features to aid debugging. If you compile the
source with -DRANGE_CHECK, checks are inserted for accessing arrays
past their boundaries, as well as some other sanity checks. If any
of these checks fail, an error box will be displayed with the assertion
that failed.
For debugging purposes, it is also possible to build a DOS executable
that contains the Arasan search engine, but no user interface. This
executable is called "TESTSRC". To build TESTSRC, do "make -f testsrc.mak".
Since testsrc is a DOS executable but uses many of the same files as the
Windows executable, be sure to delete any old object files before building
TESTSRC. "make clean" will do this.
TESTSRC takes a command line that consists of one of two optional
switches followed by an optional filename. The two allowable
switches are:
-p <number> - performs a fixed-ply search to the indicated depth.
-t <number> - searches for the given number of seconds.
The filename, if specified, contains one or more board positions to be
analyzed, one per line. The position info is in EPD (Extended
Position Description) format. TESTSRC currently only recognizes a
small subset of EPD, sufficient to run the tests.
If the search module is compiled with -DTRACE, TESTSRC will print out
copious information about the search process when it is run. See
search.cpp to see what information is output and where it comes from.
Another compile-time debugging option is -DDEBUG_ATTACKS, which will
insert test code to check that the attack information for the board
is updated correctly when a move is made.
Because TESTSRC is a DOS program, it can easily be run from a batch
file to execute a series of tests. The following test suites are
provided with the source code (see the TESTS directory):
1. TESTS.EPD contains 22 fairly simple tactical problems. Arasan
should solve all these problems with a nominal 3-ply search.
2. WAC.EPD contains the 300 problems from Reinfeld's "Win At Chess"
book. These are mostly easy tactical problems. Arasan solves about
2/3 of them with a 60-second search limit on a Pentium 60.
3. The file BK.EPD contain the Bratko-Kopec series of tests. It is
standard procedure to allow the program 2 minutes for each test.
Some of these problems are quite difficult. Arasan currently solves
11 out of the 24 problems when run on a 60MHz Pentium.
4. The file ECE3100.EPD contains the first 100 problems from
Encyclopedia of Chess Endgames, volume 3. These are rook endgames.
5. Another endgame test suite is FINE.EPD, containing 48 king and
pawn endgame problems from Reuben Fine's Basic Chess Endings.
6. TYPP.EPD contains tests from Bellin and Ponzetto's book Test
Your Positional Play.
7. ECM.EPD contains test positions from the Encyclopedia of Chess
Middlegames. These are difficult middlegame problems.
The RESULTS file in the TESTS directory summarizes Arasan's performance
on these test suites. Details of the test runs can be found in the log
files, e.g. TESTS.LOG, BK.LOG, etc., which are also in the TESTS
directory.
Algorithms and data structures
1. The chess board
Following is some information about the algorithms and data
structures used by Arasan. If you are new to computer chess
programming, I suggest first reading a general work on the
subject such as Frey (1983) or Marsland and Schaeffer (1990).
The chess board in Arasan is represented by an array of 64 squares.
Each square contains 0 if it is empty, or a piece identifier if it is
occupied. Black pieces have identifier values between 1 and 6, while
White pieces have values between 9 and 15. (Note: the program used
to use a variant of the "mailbox" representation described by Frey,
with 120 squares. This allows easy detection of when a piece has
reached the board boundary. However, because stack space is a
limited resource in Windows, the representation was changed to use
only 64 squares). A special value (127) is used to represent a
square that is uninitialized or invalid.
The Board class also maintains a parallel board representation called
"PiecePos" consisting of two arrays of 16 entries each. These contain
lists of the squares occupied by pieces of each side. Any array
entry unoccupied by a piece is set to 127.
Several other pieces of information are stored in the Board class and
are updated for each move. The KingPos array holds the location of
each side's king. The PFileCount array holds a count of the number
of pawns on each file, for each side, and the RFileCount array holds
a similar value for rooks. The EnPassantSq array holds the square
position, for each side, on which an en passant capture is possible
(obviously, only the side to move can have a possible en passant
capture, but we maintain two values for programming convenience).
The CastleStatus array holds an enum for each side indicating whether
castling has occurred. Also, if the king or a rook has been moved,
making castling on one side or another impossible, CastleStatus is
set to an appropriate value.
Each board position also has a hash code associated with it. The
hash code is 32 bits and is computed by fetching, for each piece and
square combination, a unique 32-bit code from a table of random
numbers, and computing the exclusive or of these codes. The
low-order bit of the hash code is then set to identify whether White
or Black is to move. Castling status and en passant status are not
folded into the hash code, but are stored in a separate field in a
hash table entry - they must match the current board in order
for a position to be retrieved from the table.
Finally, the board position includes an array, for each side, of type
Attacks. This contains an integer for each square indicating which
pieces and of what kinds attack the square. This information is
incrementally updated (by the Attacks class) when a move is made or
unmade. Attacks does not store information about "stacked"
attackers, which may make some calculations involving this
information inaccurate.
An earlier chess program I wrote re-calculated complete attack
information for each position that was evaluated, but it wound up
spending a large fraction of its time in the attack calculation
procedure. Incrementally updating the attack information is
relatively cheap and probably faster.
2. Moves
There are three move classes used in Arasan. "Move" maintains only
the minimal information need to unambiguously specify a move: start
square, destination square, and promotion value. "ExtendedMove"
contains also the piece being moved, the piece being captured (if
any), and the type of move (normal, castling, en passant, etc).
Finally, "ReversibleMove" contains in addition the castling and en
passant status of the board before the move was made. Since this
information is needed for restoring the board position, only a
ReversibleMove can be "undone".
3. Move Generator
The move generation logic is mostly contained in the classes Bearing,
Move_Generator and Move_Ordering. Bearing contains static functions
to compute squares to which a piece can move. Except for pawn moves,
and for castling, move generation is done by table lookup. Each
class of piece has an associated table. Each table is indexed by
square number and side to move, and contains a list of squares to
which the piece could move (these tables are in beardata.h).
Additional checks are made to see that the destination square doesn't
contain a piece of the same color. However, moves into check are
possibly included.
The Move_Generation class uses functions from Bearing to generate all
moves for a given side. Moves into check are included, unless the
side to move is in check. In that case, a special function
(Check_Evasions) is called that strictly checks moves for equality.
It is very important to know whether any legal moves are possible
when in check: if there are none, the side to move is checkmated.
Also, some search extensions depend on the existence of a forced move
(one single legal move).
Move_Generation also calls the Move_Ordering class to re-order the
moves. At ply 0, some rather elaborate criteria are used for move
ordering: this includes a computation of each moves' positional
score, and also (for captures), a call to function attack_estimate,
which uses the attack information for the destination square to
estimate the gain from a capture. At ply 0, all moves are scored and
sorted. At higher plies, a less elaborate algorithm is used, which
moves the first few most promising moves to the start of the move
array, and then sorts only those. Captures are given a high priority
in the search order - higher if the capture appears to gain material,
but fairly high even if the capture apparently loses. Promotions are
also given special treatment.
Arasan also uses the "killer" heuristic. Moves that cause beta
cutoff in the search are stored in a killer array, and the move
ordering routine will give these moves a higher score than other
moves. But captures are generally given a higher score than "killer"
moves, so the killer moves have a relatively small effect on the
overall move ordering. The scoring algorithms in Move_Ordering have
been tuned to maximize search speed, mostly on the tactical problems
in the test suite, but there is doubtless more room for improvement.
4. Searching
Arasan uses an alpha-beta search algorithm with a variety of search
extensions. The search class is the largest single module in the
program, and is necessarily rather complicated, but I have tried to
structure it and comment it so that it is understandable. I will
assume that the reader knows the basics of the alpha-beta algorithm,
and will concentrate on describing this implementation of it.
In general, the search routine tries to terminate a search tree, or
some portion of one, as soon as possible, and will defer as much work
as possible until it is certain that no early and quicker termination
can be done. The techniques for doing this are mostly well-known and
there is nothing very original about the search algorithms used by
Arasan. However, as with most chess programs, there is a fine
balance between terminating a search too soon and extending it into
unprofitable and very unlikely lines of play. The precise nature of
this balance depends not only on the search algorithms used, but also
the relative efficiency of operations such as move generation,
position evaluation and move ordering. Each program therefore
strikes this balance in a somewhat different way.
The entry point for a search is a routine called find_best_move.
This function does some initialization, and then calls move_search,
which implements the alpha-beta search algorithm. The search
proceeds one ply (half move, i.e. move by one side) at a time. That
is, first a one-ply search is done, then a two-ply search, then
three, etc. until either the maximum ply limit has been reached or
the time control has been exceeded. Each search uses the results of
the preceeding search. The variable "max_depth" holds the current
nominal ply depth for the search. However, the presence of search
extensions means that some nodes may be searched to a greater depth
than this.
The first step in move_search is to look in the hash table (further
described in the next section), in order to see if an identical
position has been visited before. This may happen due to a
transposition of moves that lead to the same position, or because a
previous search to a shallower depth visited the same node. If a
hash table entry is found and if it contains a valid value (i.e. one
that did not cause cutoff), then that value is returned immediately
and no further searching from that node occurs. In other cases, the
hash table may not contain an exact value, but may hold an upper or
lower bound that can be used to narrow the alpha-beta window.
After the hash table check, a further check is made to see if the current
board position is drawn, due to insufficient material or to a 3-fold
repetition of moves. Arasan does not currently check for draws due to
the 50-move rule or variants of it.
If the position is drawn, move_search returns the negative of the
evaluation function. This penalizes the program for allowing a draw
when it is ahead in material or has a superior position.
If the search has reached the nominal search depth plus the maximum
possible search extension depth, or exceeded the maximum supported
ply depth, it also terminates immediately.
Normally the initial score for the position is set to -Constants::BIG
(BIG is a large integer used several places in the search process).
However, a different approach is taken when pursuing search
extensions, i.e. portions of the search tree beyond the nominal
search depth (max_depth). Three different types of search extension
are used in Arasan:
a. If the side to move is not in check, and if the search exceeds the
maximum depth, searching continues but includes only promotions and
capture moves that appear to gain material. There is a limit, set in
the Extensions structure, to how many additional plies may be
searched.
b. If the side to move is in check, the search is also extended, and
includes all legal replies, again up to the limit set in Extensions.
c. Finally, forced moves (moves with only one possible reply) cause
the search to be extended one ply and this ply is not counted towards
the computation of any other search limit.
In case a., which in the source code is called a "capture search,"
the current position is evaluated and searching may terminate if the
resulting value causes beta cutoff. The reasoning is that there is
no point continuing the search if the side to move can "stand pat"
without making any furture captures, and still obtain a value high
enough for cutoff. However, this early cutoff is inhibited if the
side to move has an apparently trapped piece, or appears to have
a "hung" piece (a piece subject to profitable capture).
If early cutoff doesn't occur, the next step is to try the "null
move.". The side to move is changed without altering the board
position and the opposing side is then allowed to move. Of course,
this could not occur in a game - a player is not allowed to "pass,"
but must move. However, the theory is that if the null move causes
cutoff, then the side to move must have a good position, since in
effect giving the opponent a free move still produces a high value
for the side to move. In this case, beta cutoff is allowed to occur
and no more searching is done from this node. This search
enhancement is enabled by compiling with -DNULL_MOVE.
If the null move does not cause cutoff, then the principal variation
move is searched. In the case of an initial search (e.g. a one-ply
search), the principal variation move is just the first move returned
by the move generator. Otherwise, at ply 0, it is the
highest-scoring move from the previous search iteration. This is
obtained by sorting the 0-ply moves from the last iteration, all of
which were stored along with their scores. At deeper plies, the hash
table is queried and if a best move has been stored for the position,
that move is tried first and is considered the principal variation.
The search first calls skip_move to see if the move should be
skipped. skip_move enforces the search extension limits mentioned
above. If the move is not to be skipped, skip_move returns "False",
and the search proceeds to call try_move. try_move will first make
the move, then query the attack info for the board to see if the side
to move is in check (remember, the move generator typically does not
exclude moves into check). If a move into check is found, the
special value Illegal is returned and the next move is tried. If the
move passes the legality check, then move_search is called again (it
is thus indirectly recursive).
If the return value from move_search did not cause cutoff, then the
search must be peformed again with a wider search window. try_move
takes care of this. try_move also checks the timer (if a timed search
is being done) and determines when to terminate the search. Usually
this occurs when 95% of the allotted time has been used, but in some
cases the computer will be allowed to search a slightly longer time.
Once try_move returns a value for the node, it is compared against
alpha and beta. If the value exceeds alpha, then a new best move has
been found at this node and the move is stored. If the value exceeds
beta, cutoff occurs and no more searching is done. A special check
is also made to see if the value indicates that mate has occurred,
since there is no point searching any further after that.
Assuming the principal variation move does not cause cutoff or mate,
then move_search proceeds to search the remainder of the moves.
These are searched with a minimal alpha-beta window. Note that since
the principal variation move is usually obtained from the hash table
or the ply0moves array, it may be the case that the move generator
has never had to be called before now. If so, we call it at this
time. An earlier version of Arasan tried to optimize things further
by only generating part of the moves at a time. That way, if cutoff
occurred, the remainder of the move generation could be skipped.
However, there was a cost to this in terms of complexity and it did
not seem to help search speed much.
The final part of move_search checks to see if checkmate or stalemate
occurred, and updates the hash table.
When the top-level invocation of move_search terminates, it returns
to find_best_move, which determines the time used, and updates the
"Statistics" structure with the time and other information about the
search.
5. The hash table
The search routine uses a hash table for storing the results of
evaluating previously visited positions. This table is implemented
using a template class (Hash) because the opening book generator uses
a similar hash table but stores somewhat different information in the
table. The hash table is basically an array of lists (class S_List).
Each list contains a series of nodes (S_Node), each of which contains
some data (in the case of the search engine, a class of type
Position_Info) and a pointer to the next node. Each list holds
entries that hash, modulo the hash table size, to the same value.
Each node contains the whole hash code, so that finding a given node
to match a given hash code consists of indexing into the hash table,
then following the list until the hash codes match.
Besides the hash code, each hash entry also contains the score for
the node, a set of flags indicating whether the value is exact, an
upper bound or a lower bound, the depth of search used to evaluate
the node, a word holding the castling status and en passant square,
and the best move for the position.
Under Windows, the number of lists in the hash table is configured
at runtime based on the amount of memory available. Under DOS,
the number of lists is fixed in size. In both cases, there is also
a limit on how many nodes can be entered into the table. The hash
table is cleared after each search.
Memory allocation and deallocation for the hash table can be quite
expensive, especially under Windows. To minimize this, operators
new and delete are overloaded for both S_List and S_Node classes.
These operators use a simple memory allocation scheme that obtains
memory from the OS in large chunks and does suballocation out of
the chunks (class Pool implements this). Memory freed via "delete"
goes into a free list and is immediately available for allocation
again via "new". Memory is not actually freed until the "freeAll"
method is called, which occurs only when the hash table is cleared.
Then the large memory chunks are returned to the OS.
6. Position Scoring
There are six main components to the positional score used by Arasan:
a. Center control
b. Development
c. Castling
d. Pawn structure
e. King safety
f. Threats
The positional score is typically within the range of plus or minus
the value of a pawn (64), but can be greater in some circumstances.
When in the endgame (determined by the amount of material on the board)
simplified and/or different scoring parameters are used. Also,
special scoring code is used for "mopping up" positions, in which
one side has a large material advantage. Chess 4.5 from Northwestern
University used a similar scheme.
Center control is mostly calculated by table lookup. For sliding
pieces, a check is made to be sure that the piece has an unobstructed
line of movement to the center of the board. This component of the
score is not used in the endgame.
The development score encourages the program to move its pieces from
the back rank, but discourages premature development of the queen.
A measure of piece mobility is also calculated for bishops and rooks.
Bonuses are awarded for a rook on the 7th rank, and also for a rook
on an open or half-open file, and for doubled rooks.
In the endgame, the development score encourages the king to move
towards the center or opposite side of the board, and to stay
near pawns. A bonus is also awarded for having the opposition.
This code is not very effective in producing good play, but it
is better than nothing. Special-case code is included for
KPK and KNBK endgames, which enables the program to play these
fairly well.
When "mopping up", the development score gives a bonus for restricting
the opposing king's mobility and for driving the opposing king to
an edge or corner of the board.
The castling score penalizes the program for being unable to castle,
either temporarily (because a square traversed by castling is under
attack) or permanently (because the king or rook has made another move).
The pawn structure score penalizes isolated, backward, and doubled
pawns, and gives a bonus for passed pawns. If a passed pawn exists,
its value depends on its rank and also on whether squares ahead of
it are occupied or controlled by the opposing side.
The king safety score evaluates the extent to which the king is
protected by pawns, and also the extent to which opposing pieces
attack squares near the king. It is pretty inexact, but does
at least alert the side to move when the king is in serious trouble.
The threat score penalizes the side to move for having pieces
that appear to be trapped or to be subject to profitable capture
("en prise"). A pinned piece is given the same penalty as a
trapped piece. Usually the search routine will extend the search
when such situations exist, but we have to do something when the
terminal search depth is reached.
Contacting the Author
I have been working on computer chess programming for around seven
years, and this is the second complete chess program I have written.
It is still very much imperfect - but I have decided to release it
in its present form, both so that others can enjoy playing with it
and so that programmers can study it, learn from it, and maybe
improve it. I don't guarantee any support for this program. However,
if you do find bugs in it, or discover a way to improve it, I would
like to hear from you. I can be reached at:
Jon Dart
553 N. 17th St.
San Jose, CA 95112
Or via Internet at jdart@rational.com.
References
Frey, Peter W. (ed.) (1983). Chess Skill in Man and Machine.
New York: Springer-Verlag.
Marsand, T. Anthony and Schaeffer, Jonathan (1990). Computers,
Chess and Cognition. New York: Springer-Verlag.